Abstract

The genus Sulfobacillus is a cohort of mildly thermophilic or thermotolerant acidophiles within the phylum Firmicutes and
requires extremely acidic environments and hypersalinity for optimal growth. However, our understanding of these organisms is still
preliminary, partly because few genome sequences are available. Here, the draft genome of Sulfobacillus thermosulfidooxidans
strain ST was deciphered to obtain comprehensive insight into its genetic content and to understand the cellular
mechanisms necessary for its survival. Furthermore, the expression of key genes related to iron and sulfur oxidation was
verified by semi-quantitative RT-PCR analysis. The draft genome sequence of Sulfobacillus thermosulfidooxidans strain ST,
which encodes 3225 predicted coding genes on a total length of 3,333,554 bp with a G+C content of 48.35%, revealed a high degree
of heterogeneity with other Sulfobacillus species. The presence of numerous transposases, genomic islands and complete
CRISPR/Cas defence systems testifies to its dynamic evolution, consistent with the genome heterogeneity. As expected, S.
thermosulfidooxidans encodes a suite of conserved enzymes required for the oxidation of inorganic sulfur compounds (ISCs).
A model of sulfur oxidation in S. thermosulfidooxidans is proposed, which shows some characteristics that differ from
sulfur oxidation in the Gram-negative A. ferrooxidans. Sulfur oxygenase reductase and heterodisulfide reductase are
suggested to play important roles in sulfur oxidation. Although iron oxidation ability was observed, some key
proteins could not be identified in S. thermosulfidooxidans. Unexpectedly, a predicted sulfocyanin is proposed to transfer
electrons during iron oxidation. Furthermore, its carbon metabolism is rather flexible: it can transform
pentoses through the oxidative and non-oxidative pentose phosphate pathways and can take up small organic
compounds. It also encodes a multitude of heavy metal resistance systems that help it adapt to heavy metal-containing environments.
Introduction

In extremely acidic environments such as acid mine drainage
(AMD), low pH, high concentrations of toxic elements and low levels
of organic materials make growth conditions harsh. While these
conditions are toxic for most prokaryotic and eukaryotic
organisms, some bacteria and archaea are not only resistant to but
also able to metabolize the toxic compounds present [1]. Members
of the genus Sulfobacillus are typical examples and frequently occur
in AMD, acid hot springs, and hydrothermal vents, for example Sulfobacillus
benefaciens [2] and Sulfobacillus acidophilus [3]. Taxonomically, the
genus Sulfobacillus, along with the genus Thermaerobacter, has
tentatively been assigned to a family, "Clostridiales family XVII
incertae sedis", which may form a deep branch within the phylum
Firmicutes or may form a new phylum [4]. Until now, five species
have been isolated and assigned to the genus Sulfobacillus [5], all of
which are mildly thermophilic or thermotolerant acidophiles
that grow optimally in mixotrophic media containing inorganic
sulfur compounds (ISCs), mineral sulfides and organic matter [6].
Some of the species that have been tested can also oxidize
iron [3,7].

Over the past few years, an increasing number of acidophile
genomes have been sequenced. There are now at least 56 draft or
completely sequenced genomes of acidophiles, including 30
bacteria and 26 archaea, providing a first glimpse of the genomics
of acidophilic life over a range of environmental conditions [8].
However, most of these genomes belong to Gram-negative
acidophiles. To date, only S. acidophilus in the genus Sulfobacillus
has been investigated genomically, revealing that Sulfobacillus
exploits a surprisingly different enzymatic repertoire for energy
and carbon metabolism compared with its Acidithiobacillus
counterparts. For example, S. acidophilus lacks homologues of
rusticyanin, typically found in iron-oxidizing acidophiles [9],
suggesting different components for electron transport in iron
oxidation. Furthermore, members of the genus Sulfobacillus also
exhibit remarkably different abilities of iron(II) oxidation and
sulfide oxidation [10]. A comprehensive understanding
of the metabolic versatility and environmental adaptations of
Sulfobacillus will therefore require comparative genomic analysis with more
members of the genus.

In this study, S. thermosulfidooxidans strain ST, isolated from an
acid hot spring in Tengchong, Yunnan (Southwestern China),
presents interesting physiological and metabolic capacities (Fig. S1
in File S1 and Table 1). A deep-coverage draft genome of S.
thermosulfidooxidans strain ST was sequenced and analyzed. It was
then compared with the genomes of other thermophiles to explore
the physiology of S. thermosulfidooxidans at the whole-genome level.
Furthermore, semi-quantitative RT-PCR analysis was performed
to verify the expression of key genes related to iron and sulfur
oxidation. A detailed analysis of energy metabolism and central
carbon metabolism in S. thermosulfidooxidans is described. This
genome exploration revealed a very dynamically evolving genome
contributing to an unexpected physiological versatility.

Materials and Methods 

Growth of S. thermosulfidooxidans Strain ST and DNA 
Extraction 

Sulfobacillus thermosulfidooxidans strain ST was grown at 45°C
aerobically in 9K basal salt medium [11] with 4.5% (w/v) ferrous
sulfate. Microbial cells were harvested by centrifugation (12,000
rpm) for 10 min at 4°C. Genomic DNA was extracted from the
pelleted cells using the TIANamp Bacteria DNA kit (TIANGEN,
China) according to the manufacturer's instructions and finally
suspended in Milli-Q water. The genomic DNA was stored at
−80°C until used for genome sequencing.

RNA Extraction and Semi-quantitative RT-PCR Analysis 

Ferrous sulfate (45 g/L) and elemental sulfur (S⁰, 10 g/L) were
used separately as substrates for cultivation. Microbial cells were
harvested in mid-exponential growth phase. Total RNA was
extracted using TRIzol reagent (Invitrogen, Carlsbad, USA),
treated with RNase-free DNase I (Qiagen, Valencia, USA) and
purified with an RNeasy kit (Qiagen, Valencia, USA). Single-stranded
cDNA was then synthesized with the ReverTra Ace qPCR RT
Kit (Toyobo, Japan) according to the manufacturer's protocol.
The cDNA was stored at −80°C until used for semi-quantitative
RT-PCR analysis.

Primers targeting selected genes putatively involved in ISC
metabolism and iron oxidation were designed for semi-quantitative
RT-PCR (Table S1 in File S1). Semi-quantitative RT-PCR
was performed in a 25 µl reaction mixture containing 12.5 µl
universal Taq PCR Master Mix (Tiangen Biotech, China), 0.5 µl
single-stranded cDNA, 1 µl each of 10 µM forward and
reverse primers, and 10 µl deionized water. The
amplification protocol was as follows: 95°C for 5 min, then 40
cycles of 95°C for 20 s, 55°C for 15 s, and 72°C for 15 s, and a
final incubation at 72°C for 10 min. PCR products were visualized
on 2% agarose gels and sequenced bidirectionally.
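
As a quick arithmetic check of the reaction set-up described above, the following Python sketch sums the listed component volumes and derives two quantities that the text does not state explicitly: the final concentration of each primer and the total programmed hold time of the cycling protocol. The variable and function names are illustrative only, and ramp times between temperature steps are ignored as a simplifying assumption.

```python
# Reaction components in microlitres, as listed in the text.
REACTION_UL = {
    "taq_master_mix": 12.5,
    "cdna": 0.5,
    "forward_primer": 1.0,
    "reverse_primer": 1.0,
    "deionized_water": 10.0,
}

PRIMER_STOCK_UM = 10.0  # primer stock concentration (micromolar), from the text


def final_concentration(stock_conc, added_volume, total_volume):
    """Dilution rule C1 * V1 = C2 * V2, solved for C2."""
    return stock_conc * added_volume / total_volume


total_ul = sum(REACTION_UL.values())
primer_final_um = final_concentration(
    PRIMER_STOCK_UM, REACTION_UL["forward_primer"], total_ul
)

# Cycling programme: 95 C for 5 min, 40 x (20 s + 15 s + 15 s), 72 C for 10 min.
# Ramp times are not included (assumption).
CYCLES = 40
program_seconds = 5 * 60 + CYCLES * (20 + 15 + 15) + 10 * 60

print(f"total volume: {total_ul} ul")
print(f"final primer concentration: {primer_final_um} uM")
print(f"programmed hold time: {program_seconds} s")
```

Under these assumptions the components add up to the stated 25 µl, each primer is diluted 25-fold from its 10 µM stock, and the programmed holds total 2900 s.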

Genome Sequencing, Assembly, Annotation 

Genomic library construction, sequencing, and assembly were
performed at the Beijing Genomics Institute (BGI; Shenzhen,
China) using the Illumina HiSeq 2000 sequencing platform, which
yielded approximately 350 Mb of sequence information. The
raw reads were assembled into 53 supercontigs using the SOAPdenovo
package [12]. Coding sequences (CDSs) were predicted with
the ORF finder Glimmer [13]. All CDSs were manually curated
and verified by comparison with the publicly available databases
NCBI non-redundant [14], KEGG [15] and COG [16] using
BLAST [17]. The unassigned CDSs were
further annotated using the hmmpfam program of the HMMER
package 26.0 [18]. The hidden Markov models for the protein
domains were obtained from the Pfam database 26.0. tRNA and
rRNA genes were identified using
tRNAscan-SE [19] and RNAmmer [20], respectively.
CRISPR loci were identified using CRISPRFinder [21]. Transporter
genes were annotated by additionally taking
into account the information in the transporter classification database
[22]. Detailed information on the genes, ordered by functional
category, is summarized in Tables S2, S3, S4, and S5 in File S1.
The whole genome shotgun project has been deposited at DDBJ/
EMBL/GenBank under the accession number PRJNA203261.



Table 1. General features of the S. thermosulfidooxidans genome in comparison with other Clostridiales family XVII incertae sedis genomes.

| Organism | Sulfobacillus thermosulfidooxidans ST | Sulfobacillus acidophilus DSM 10332 | Sulfobacillus acidophilus TPY | Thermaerobacter marianensis DSM 12885 |
| --- | --- | --- | --- | --- |
| Habitat | Acid hot spring | Acidic sulfidic and sulfurous sites | Hydrothermal vent in the Pacific Ocean | Mud from the Mariana Trench in the Pacific Ocean |
| Temperature range | Moderate thermophilic (48°C optimum) | Moderate thermophilic (50°C optimum) | Moderate thermophilic (approximately 50°C) | Thermophilic (50-80°C) |
| pH range | 1.2-2.4 | 1.6-2.3 | 1.6-2.3 | 5.4-9.5 |
| Motility | Non-motile | Non-motile | Motile | Non-motile |
| Nutrition type | Mixotrophic | Mixotrophic | Mixotrophic | Chemoheterotroph |
| Genome size in Mb | 3.33 | 3.56 | 3.55 | 2.84 |
| G+C content | 48.35% | 56.8% | 56.8% | 72.5% |
| Protein-coding genes | 3225 | 3585 | 3837 | 2435 |
| 16S-23S rRNA genes | 1 | 5 | 5 | 2 |
| Number of tRNAs | 48 | 69 | 52 | 60 |

doi:10.1371/journal.pone.0099417.t001
Genome Synteny Comparisons

Pairwise alignments for dot plot representations were performed
on six-frame amino acid translations of the genome sequences of
Sulfobacillus thermosulfidooxidans, Sulfobacillus acidophilus and Thermaerobacter
marianensis using the Promer program in the MUMmer 3.23
package [23]. The default parameters were applied in all analyses,
so exact matches longer than six amino acids were identified and
adjacent exact matches were joined if they were separated by a gap no longer than 30
amino acids. The resulting clusters were further
processed if their matches were longer than 20 amino acids and
then aligned using the BLOSUM62 amino acid substitution matrix.
Furthermore, the Genome-to-Genome Distance Calculator (GGDC)
was used to estimate the overall similarity among the three
genomes, and the inferred digital DNA-DNA hybridization values
were calculated [24].
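
The match-joining rule described above can be illustrated with a deliberately simplified Python sketch. It works on one sequence coordinate only and is not the Promer implementation: it keeps matches longer than six residues, joins neighbouring matches separated by a gap of at most 30 residues, and retains clusters whose summed match length exceeds 20 residues. The function name, the `(start, length)` tuple layout and the toy coordinates are assumptions made for illustration.

```python
def cluster_matches(matches, min_match_len=6, max_gap=30, min_cluster_len=20):
    """Join nearby exact matches into clusters (simplified, one-dimensional).

    matches: iterable of (start, length) tuples in amino-acid coordinates.
    Returns a list of dicts with the cluster start, end and summed match length.
    """
    # Keep only matches longer than the minimum match length.
    kept = sorted(m for m in matches if m[1] > min_match_len)
    clusters = []
    for start, length in kept:
        end = start + length
        if clusters and start - clusters[-1]["end"] <= max_gap:
            # Close enough to the previous cluster: extend it.
            clusters[-1]["end"] = max(clusters[-1]["end"], end)
            clusters[-1]["matched"] += length
        else:
            clusters.append({"start": start, "end": end, "matched": length})
    # Keep clusters whose summed matches exceed the cluster threshold.
    return [c for c in clusters if c["matched"] > min_cluster_len]


# Toy coordinates (hypothetical, not taken from the genomes in this study).
toy_matches = [(0, 8), (20, 10), (45, 7), (100, 5), (200, 9)]
clusters = cluster_matches(toy_matches)
print(clusters)
```

In this toy input the first three matches are chained into a single cluster, the five-residue match is discarded as too short, and the isolated nine-residue match fails the cluster-length threshold.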

Calculation of COGs for Venn Diagrams 

The predicted proteome sequences of the selected genomes (S.
acidophilus and T. marianensis), except for that of S. thermosulfidooxidans, were
retrieved from the NCBI database. The best sequence similarities
were obtained by BLAST against the COG database using maximal
E-value= le. Proteins that were not grouped into COGs were
represented as specific proteins for each organism. The calculation
of COGs was performed and visualized with the R package [25].
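
The bookkeeping behind such a Venn diagram amounts to set operations on the COG assignments of each genome. The sketch below uses Python rather than the R package used in the study, and the COG identifiers are placeholder values chosen for illustration, not results from the paper.

```python
def venn_regions(cogs_by_genome):
    """Map each combination of genomes to the COGs found in exactly those genomes.

    cogs_by_genome: dict of genome name -> set of COG identifiers.
    Returns a dict of frozenset(genome names) -> set of COG identifiers.
    """
    all_cogs = set().union(*cogs_by_genome.values())
    regions = {}
    for cog in all_cogs:
        members = frozenset(
            name for name, cogs in cogs_by_genome.items() if cog in cogs
        )
        regions.setdefault(members, set()).add(cog)
    return regions


# Placeholder COG assignments (hypothetical, for illustration only).
toy = {
    "S. thermosulfidooxidans": {"COG0001", "COG0002", "COG0003"},
    "S. acidophilus": {"COG0001", "COG0002", "COG0004"},
    "T. marianensis": {"COG0001", "COG0005"},
}
regions = venn_regions(toy)
for members, cogs in sorted(regions.items(), key=lambda kv: -len(kv[0])):
    print(sorted(members), sorted(cogs))
```

Each key of the returned dictionary corresponds to one region of the Venn diagram, so the size of each set gives the number to print in that region.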

Phylogenetic Analyses 

Predicted amino acid sequences of selected genes were aligned
with reference sequences using the multiple sequence alignment tool
ClustalW 2.0 [26]. Unless mentioned otherwise, phylogenetic trees
were constructed using Molecular Evolutionary Genetics Analysis
software (MEGA, version 4.0).

Results and Discussion 

Sequencing and Automatic Annotation of the Strain ST 
Genome 

The draft genome sequence of S. thermosulfidooxidans strain ST
contained 53 supercontigs, ranging from 507 bp to 547,747 bp
(the N50 and N90 contig sizes are 376,394 bp and 36,184 bp,
respectively), with a total length of 3,333,554 bp. Since most
contigs end with repeated sequences, further assembly was not
possible with the current data. However, given that the present draft
has 100× sequence coverage, it is reasonable to assume that the
majority of genes in the genome of strain ST are identified in
the current draft. The genome differs from those of the other members of the Clostridiales
family XVII incertae sedis by its lower overall G+C content (48.35%)
(Table 1). Furthermore, of the 3225 predicted open reading
frames (ORFs), 704 ORFs (22%) are annotated as hypothetical
proteins and more than 1190 predicted proteins of S.
thermosulfidooxidans do not have their best hits within the members of
Clostridiales family XVII incertae sedis. Likewise, few large syntenic
regions were observed between the S. thermosulfidooxidans genome
and those of other Clostridiales family XVII incertae sedis (Fig. 1). The
digital DNA-DNA hybridization value inferred by the GGDC
(Genome-to-Genome Distance Calculator) for S. thermosulfidooxidans
and S. acidophilus is <30%. Altogether, these findings suggest that
the gene complement of S. thermosulfidooxidans is significantly
different from those of other Clostridiales family XVII incertae sedis.
We also performed a comparative COG analysis
among S. thermosulfidooxidans, T. marianensis and S. acidophilus (Fig.
S2 in File S1). In general, S. thermosulfidooxidans shared more
orthologous genes with S. acidophilus than with T. marianensis. However, more
than one third of the genes in the S. thermosulfidooxidans genome are unique,
which suggests that some physiological features of S. thermosulfidooxidans
differ substantially from those of the other two bacteria. In addition, one
rRNA operon containing 16S, 23S and 5S rRNA genes was found
in the draft genome of S. thermosulfidooxidans strain ST (Table S2 in
File S1). The 16S rRNA sequence of S. thermosulfidooxidans has 93%
and 86% nucleotide identity to the 16S rRNAs of S. acidophilus and
T. marianensis, respectively. S. thermosulfidooxidans also encodes a
collection of information processing genes similar to that of the other
Sulfobacillus members (Table S2 in File S1).
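
For readers unfamiliar with the N50 and N90 statistics quoted for the assembly, the following Python sketch shows the usual definition: contigs are sorted from longest to shortest, and Nx is the length of the contig at which the running total first reaches x% of the assembly size. The function name and the toy contig lengths are illustrative assumptions; the actual supercontig lengths of strain ST are not reproduced here.

```python
def nx(lengths, percent):
    """Return the Nx statistic (e.g. percent=50 for N50) of a list of contig lengths."""
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= percent * total:
            return length


# Toy contig lengths in bp (hypothetical values).
toy_contigs = [100, 80, 60, 40, 20]
n50 = nx(toy_contigs, 50)
n90 = nx(toy_contigs, 90)
print(n50, n90)
```

A large N50 relative to the genome size, as reported for this draft, indicates that most of the sequence sits in a small number of long supercontigs.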

Integrative Elements, CRISPR Defence System 

More than 47 transposase genes with transposase signatures are
identified in the S. thermosulfidooxidans genome and can be assigned
to at least nine different types of transposase (Table S3 in File S1)
that are associated with insertion sequence families IS3, IS4, IS21,
IS66, IS110, IS200, IS605, IS1477 and ISChy4 [27]. Interestingly,
the IS element of S. thermosulfidooxidans belonging to group IS605
occurs in 14 copies in the genome, but these copies have best BLAST hits
in 11 different genera excluding the genus Sulfobacillus.
Furthermore, most of these IS elements occur in poorly conserved
genomic regions, and several elements appear to have inserted into
protein-encoding genes, indicating that they contribute significantly
to genome evolution. Although twenty-seven transposase genes
have best hits within the genus Sulfobacillus, the organization of the regions
surrounding the IS elements is significantly different from that
in S. acidophilus, indicating that these IS elements have markedly changed
the genomic structure of S. thermosulfidooxidans. Apart from IS elements,
some other genes related to DNA transposition were also
identified in the S. thermosulfidooxidans genome (Table S3 in File S1).
For example, a site-specific recombinase that can catalyze
sequential DNA strand exchange reactions [28] was found in the
genome. In addition, a cassette chromosome recombinase gene was
identified in the genome (Table S3 in File S1). This cassette
chromosome recombinase typically controls the transmission of
resistance genes [29]. These site-specific recombinases may also
contribute greatly to genomic variability and physiological versatility.

Figure 1. Dot plot representation of the pairwise alignments of
the S. thermosulfidooxidans, S. acidophilus and T. marianensis
genomes. Alignments were performed on the six-frame amino acid
translation of genome sequences using the Promer program in the
MUMmer package. In all plots, every dot indicates a match of at least six
amino acids between the two genome sequences being compared, with forward
matches colored in red and reverse matches colored in blue.
doi:10.1371/journal.pone.0099417.g001

Interestingly, CRISPR/Cas (clustered regularly interspaced
short palindromic repeats/CRISPR-associated genes) viral defence
systems [30,31,32] were identified in the S. thermosulfidooxidans
genome (Table S4 in File S1). Eight cas genes oriented in the
same direction formed a typical CRISPR/Cas system with 44
spacers in the downstream genomic sequence. According to a
recent classification of the CRISPR/Cas systems into three major
types (I-III), this CRISPR/Cas system in S. thermosulfidooxidans can
be assigned to the type I CRISPR/Cas systems, which are proposed to
function in virus defence by directly targeting DNA [33].
Moreover, six CRISPR-associated cmr genes, oriented in the
opposite direction to the cas genes and all belonging to the repeat-associated
mysterious protein (RAMP) superfamily [34], are
identified immediately upstream of the cas genes. These cmr
genes may compose a CRISPR viral defence system with 39
spacers located further upstream, though the functions of these cmr
genes are largely unknown [33]. Furthermore, the sequence similarities of
these cmr genes support the view that they may have been acquired by
horizontal gene transfer. One more CRISPR locus is
identified in a distant genomic region, but only 5 spacers and
no cas or cmr genes were found in that
region. Putative CRISPR loci in the genomes of S.
acidophilus and T. marianensis were also detected by CRISPRFinder
[21]. Although associated cas genes, which
are required for viral defence, could be identified in these two genomes, associated cmr
genes were not. In contrast to S.
thermosulfidooxidans, the predicted CRISPR sequences of S.
acidophilus and T. marianensis contain fewer spacer-repeat units.
There are only 7 repeat/spacer sequences in S. acidophilus [3] and
17 repeat/spacer sequences in T. marianensis [35]. Unexpectedly,
only 2 spacers in S. thermosulfidooxidans show similarity to
spacers in the T. marianensis genome, and none of the spacers shows
similarity to spacers in the S. acidophilus genome.
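
The text does not state how spacer similarity was assessed. As one possible illustration, the Python sketch below compares two spacer sets pairwise on both strands using the standard-library `difflib.SequenceMatcher` ratio as a rough similarity score. This scoring method, the 0.9 threshold, the function names and the toy sequences are all assumptions made for illustration; an alignment-based search such as BLAST would be the more usual choice in practice.

```python
from difflib import SequenceMatcher

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def similarity(a, b):
    """Rough similarity in [0, 1], taking the better of the two strand orientations."""
    forward = SequenceMatcher(None, a, b).ratio()
    reverse = SequenceMatcher(None, a, reverse_complement(b)).ratio()
    return max(forward, reverse)


def shared_spacers(spacers_a, spacers_b, threshold=0.9):
    """Return index pairs (i, j) of spacers whose similarity meets the threshold."""
    return [
        (i, j)
        for i, a in enumerate(spacers_a)
        for j, b in enumerate(spacers_b)
        if similarity(a, b) >= threshold
    ]


# Toy spacers (hypothetical sequences, not taken from any genome).
genome_a = ["ACGTACGTAC", "TTTTTTTTTT"]
genome_b = ["ACGTACGTAC", "GGGGGGGGGG"]
pairs = shared_spacers(genome_a, genome_b)
print(pairs)
```

Few or no shared spacers between related genomes, as reported here, would be consistent with each lineage having encountered a different set of invading elements.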

Energy Metabolism 

Sulfobacillus thermosulfidooxidans can grow mixotrophically by
aerobic oxidation of ferrous iron, sulfur, and sulfide in the
presence of organic compounds, with concomitant fixation of
inorganic carbon [36]. The oxidation and electron transfer
pathways of ISCs are complex and vary among
microbes, making their prediction and elucidation difficult [37,38].
In addition, some steps in the oxidation of ISCs occur spontaneously, without enzymatic
catalysis. Previous studies of S.
thermosulfidooxidans detected several enzymatic activities involved
in the oxidation of ISCs, but the specific genes related to these
activities were not identified [39,40]. Based on genome analysis,
more than 30 genes encoding enzymes and electron transfer
proteins predicted to be involved in the oxidation of ISCs
were detected in the genome (Table S4 in
File S1). A genome-based model for ISC metabolism in S.
thermosulfidooxidans is proposed (Fig. 2). Furthermore, semi-quantitative
RT-PCR indicated that all analyzed genes related to
sulfur oxidation are expressed during growth on elemental sulfur.

The oxidation of elemental sulfur. Sulfur oxygenase
reductase (SOR) is considered a cytoplasmic enzyme that oxidizes
elemental sulfur in the cytoplasm of many sulfur-oxidizing
bacteria, although it is largely unclear how sulfur is transferred
into the cytoplasm. It converts sulfur into hydrogen
sulfide, sulfite and thiosulfate [41,42,43]. In this reaction,
elemental sulfur is both the electron donor and one of the two
known acceptors, the other being oxygen. The enzyme differs
from sulfur dioxygenase [44] and sulfur reductase [45] in that
both activities are found together. However, a recent study shows
that the reaction catalyzed by SOR is not coupled to the
electron transfer chain or to substrate-level phosphorylation in A.
caldus. The predicted SOR protein in S. thermosulfidooxidans shows a
characteristic domain of the SOR family (Pfam: PF07682) and all
residues required for activity [46]. The predicted protein shows
72% similarity with the SOR of S. acidophilus. Semi-quantitative
RT-PCR indicated that expression of the sor gene is very high
during growth on elemental sulfur, which suggests that the SOR of
S. thermosulfidooxidans is also involved in the oxidation of elemental
sulfur. Surprisingly, the SORs of the genus Sulfobacillus are assigned
to the archaeal cluster and are most closely related to a homologue
in the archaeon Ferroplasma acidarmanus, which also survives in extremely
acidic environments (Fig. 3A). These results suggest that the genus Sulfobacillus
may have exchanged the sulfur-oxidizing gene with other
extremophiles sharing a similar niche.

One of the products of the SOR-catalyzed reaction, hydrogen
sulfide, is considered to be oxidized by sulfide quinone reductase
(SQR). In this reaction, two electrons are transferred to the
electron transfer chain by the quinone. In the Gram-negative A.
ferrooxidans, hydrogen sulfide is oxidized by SQR located in the
cytoplasmic membrane [37]. One copy of the sqr gene was identified
in the genome of S. thermosulfidooxidans. An orthologous
gene that has a similar gene context to sqr from S.
thermosulfidooxidans and shares 65% similarity was also identified in S.
acidophilus. The sqr gene context in the genus
Sulfobacillus is also conserved with that in A. ferrooxidans, which strongly suggests
that sulfide quinone reductase has similar functional
properties in the Gram-positive genus Sulfobacillus.

Another product of the SOR-catalyzed reaction, thiosulfate,
may be converted by thiosulfate sulfurtransferase (TST) or
rhodanese. Five genes encoding thiosulfate sulfurtransferase
(TST) or rhodanese are dispersed in the S. thermosulfidooxidans genome.
The presence of the rhodanese motif, which is associated with ubiquitin
C-terminal hydrolases and phosphatases, gives these enzymes
the potential to participate in sulfur oxidation [47]. They are proposed
to transfer a sulfur atom from thiosulfate to sulfur acceptors such as
thiol proteins (RSH) with the production of sulfite in A. caldus [48].
Subsequently, the product of the TST-catalyzed reaction,
sulfane sulfur (RSSH), is proposed to be the substrate of
heterodisulfide reductase (HDR). Heterodisulfide reductase is a
three-subunit complex, HdrABC, that can catalyze the oxidation of
RSSH to regenerate RSH, coupled to the electron transfer
chain. Three copies of the HdrABC operon have been identified in S.
thermosulfidooxidans (Table S4 in File S1). One copy of the HdrABC
operon has a high similarity with that of S. acidophilus, while the
other two copies are more divergent from
that of S. acidophilus. Nevertheless, the conserved family domains and all
residues required for activity [49] were identified in all three HdrABC
operons. Furthermore, semi-quantitative RT-PCR showed that
HdrABC genes are highly expressed during growth on elemental sulfur.
These results suggest that HDR is also involved in the oxidation of
elemental sulfur in the genus Sulfobacillus.

Figure 2. Genome-based models for the oxidation of inorganic sulfur compounds (ISCs). Schematic representation of enzymes and
electron transfer proteins involved in the oxidation of ISCs. Electrons from SQR, TQR and HDR are transferred to the electron transfer chain by the
quinone, then are used by NADH complex I to generate reducing power or by terminal oxidases bd or bo3 to form a proton gradient. Abbreviations:
TTH, tetrathionate hydrolase; SQR, sulfide quinone reductase; TQR, thiosulfate quinone oxidoreductase; SOR, sulfur oxygenase reductase; SAT, sulfate
adenylyltransferase; HDR, heterodisulfide reductase; Omp, outer membrane protein.
doi:10.1371/journal.pone.0099417.g002

The sulfite oxidation pathway. The known periplasmic
enzymes (sorAB or soxCD) involved in the direct oxidation of
sulfite [50,51] were not identified in either S. thermosulfidooxidans or
S. acidophilus. An oxidoreductase molybdopterin binding protein
was identified as a possible sulfite oxidase, but the putative protein
harbors only a molybdopterin domain and a twin-arginine
translocation pathway signal sequence, without a dimerization
domain or an N-terminal heme domain (Table S4 in File S1).
Furthermore, no additional subunit containing the heme domain that
is required for sulfite oxidase was identified in S. acidophilus or S.
thermosulfidooxidans. Therefore, subsequent oxidation of sulfite is
most likely to occur in the cytoplasm (Fig. 2). The most likely
route for sulfite oxidation is that sulfite is converted to adenosine-5'-phosphosulfate
(APS) and then oxidized to sulfate via an indirect
pathway controlled by APS reductase and sulfate adenylyltransferase,
which is similar to the sulfite oxidation pathway in A.
ferrooxidans. A putative sulfate adenylyltransferase gene was
discovered in S. thermosulfidooxidans, although no candidates with
significant similarity to an orthologous gene of APS reductase were
found in the draft genome of S. thermosulfidooxidans. An enzyme
that converts sulfite to APS is required if sulfate adenylyltransferase
indeed converts APS to sulfate. In A. ferrooxidans, the missing
function is postulated to be accomplished by the hypothetical gene
embedded in the hdr loci of sulfur oxidizers [37]. The conserved
hypothetical gene was also found in the hdr locus of the S.
thermosulfidooxidans genome (Table S4 in File S1).



The S4I pathway. A gene encoding tetrathionate hydrolase
(TTH), which is thought to be involved in the hydrolysis of tetrathionate to
generate sulfur, sulfate and thiosulfate, was also identified in
S. thermosulfidooxidans. The activity of this
enzyme has been studied in the genus Acidithiobacillus [52,53]. The
TTH of S. thermosulfidooxidans shows 57% similarity with that of A.
caldus and has a conserved pyrrolo-quinoline quinone domain.
Previous experimental data showed that the A. caldus TTH is a
soluble periplasmic protein with maximum activity at pH 3.0 [54].
Furthermore, doxDA genes present in the S. thermosulfidooxidans genome
are predicted to encode a thiosulfate/quinone oxidoreductase
(TQR). Orthologous genes of doxDA were also detected in S.
acidophilus, which supports the view that the enzyme has
similar functional properties across the genus Sulfobacillus. In our model, the TQR is
proposed to convert thiosulfate to tetrathionate and transfer two
electrons to the quinone. The consecutive reactions catalyzed by
TTH and TQR promote sulfur oxidation in the periplasm of
S. thermosulfidooxidans.

The iron(II) oxidation. For extremely acidic environments,
the most detailed account of iron(II) oxidation pathways is
available for the Gram-negative bacterium A. ferrooxidans [9]. The
model for iron oxidation in A. ferrooxidans is related to two
transcriptional units, the petI and rus operons [55]. However, the
genome of S. thermosulfidooxidans does not contain the petI and rus
operons, and these genes are not found in S. acidophilus either.
Unexpectedly, a gene encoding the blue copper protein sulfocyanin
was found in the S. thermosulfidooxidans genome. The predicted
protein in S. thermosulfidooxidans shows a characteristic domain of
sulfocyanin (Pfam: PF06525). Sulfocyanin, which shares sequence
characteristics with A. ferrooxidans rusticyanins, is proposed to
transfer electrons during iron oxidation in the acidophilic archaea
Ferroplasma spp. [56,57]. In addition, semi-quantitative RT-PCR
indicated that the sulfocyanin gene in S. thermosulfidooxidans is
highly expressed during growth on ferrous sulfate. Thus, it is
possible that the predicted sulfocyanin is a component of the
electron transport chain in iron oxidation of S. thermosulfidooxidans,
although more details about the iron(II) oxidation pathway are still
unknown. Interestingly, phylogenetic analysis revealed that the
protein is assigned to the archaeal cluster (Fig. 3B), which makes
the origin of sulfocyanin in S. thermosulfidooxidans elusive.

Electron Transfer Chain 

S. thermosulfidooxidans encodes a fairly complete respiratory chain
consisting of complexes 1-5, which is necessary for energy
generation and reverse electron transport (Table S4 in File S1).
Three cydAB copies encoding subunits of the cytochrome bd complex
and four gene clusters that code for aa3-type terminal oxidases were
detected in the draft genome. The analysis also revealed that S.
thermosulfidooxidans has 16 genes encoding all subunits of type I
NADH dehydrogenase (subunit F of which has three copies).
Consistent with other members of the genus Sulfobacillus, S.
thermosulfidooxidans also lacks most components of the electron transfer
chain for iron oxidation; only Cyt aa3, a subunit of the
cytochrome bc1 complex and cytochrome c are present. However,
numerous other genes putatively involved in the electron transfer
chain were found in the S. thermosulfidooxidans genome (Table S4 in File
S1). This gene redundancy may provide regulatory flexibility to
confront environmental changes such as nutrient deficiency and
different substrate phosphorylations.

Central Carbon Metabolism 

S. thermosulfidooxidans encodes all key genes of the Calvin cycle
carbon fixation pathway (Fig. 4; Table S4 in File S1). In contrast
to the S. acidophilus genomes [3], S. thermosulfidooxidans contains only a
gene cluster encoding form I ribulose bisphosphate carboxylase
and lacks homologues of form II ribulose bisphosphate carboxylase.
Form I and II ribulose bisphosphate carboxylases are regulated to
adapt to environmental conditions with different levels of CO2 in
Hydrogenovibrio marinus [58]. It was reported that S. thermosulfidooxidans
could grow autotrophically when the CO2 content of the
supplied air was 5-10% [59]. Thus, the complete Calvin cycle
supports the view that carbon dioxide can be the major source of carbon
for S. thermosulfidooxidans.

In many organisms, the 3-phosphoglyceraldehyde generated by
CO2 fixation via the Calvin cycle enters the glycolysis/gluconeogenesis
pathways [47,60]. The genes identified for these pathways in
S. thermosulfidooxidans, with their reactions and potential interconnections
with other biosynthetic pathways, are presented in Fig. 4.
Fixed carbon can be channeled in either of two directions: for
glycogen biosynthesis, or to provide carbon backbones for
anabolic reactions. For glycogen biosynthesis, S. thermosulfidooxidans
contains a gene that is predicted to encode fructose bisphosphate
aldolase (EC: 4.1.2.13), which catalyzes the formation of
fructose-1,6-bisphosphate. A fructose bisphosphatase gene was also
identified in the S. thermosulfidooxidans genome. However, like
all published Sulfobacillus genomes, this genome lacks
orthologous genes encoding three key enzymes for glycogen
biosynthesis: (i) glucose-1P-adenylyltransferase; (ii) glycogen
synthase; (iii) 1,4-alpha-glucan-branching protein. Moreover, S.
thermosulfidooxidans also lacks glycogen phosphorylase, which is
involved in decomposing glycogen and thus regenerating glucose-1P
from the non-reducing terminus of glycogen [61,62]. These results
suggest that glycogen may not be the main carbon storage
compound in S. thermosulfidooxidans.

S. thermosulfidooxidans harbours candidate enzymes for all steps of 
an oxidative TCA cycle (Fig. 4 and Table S4 in File S1). The conversion 
of 2-oxoglutarate to succinyl-CoA is flexible in S. 
thermosulfidooxidans. It can be catalyzed directly by 2-oxoacid:ferredoxin 
oxidoreductase (EC: 1.2.7.3) or in two 
steps by the 2-oxoglutarate dehydrogenase complex. Moreover, it is 
notable that S. thermosulfidooxidans seems more versatile in the 
production and conversion of the central metabolite pyruvate 
(Table S4 in File S1). In addition to one copy of malate 
dehydrogenase (EC: 1.1.1.37), the genome also has two genes with homology 
to malic enzyme (EC: 1.1.1.38), which catalyzes the reversible 
conversion of malate to pyruvate and thus is involved in 
gluconeogenesis and anaplerosis. Furthermore, S. 
thermosulfidooxidans encodes an alanine dehydrogenase (EC: 1.4.1.1) that 
catalyzes the reversible deamination of alanine to pyruvate (Table 
S4 in File S1). In the reverse direction, alanine formation from pyruvate 
catalyzed by this enzyme might not only be important for protein 
biosynthesis but could also have a function in ammonia storage 
and the alleviation of ammonia toxicity. On the other hand, alanine 
might represent an important source of pyruvate for S. 
thermosulfidooxidans, as it possesses various amino acid and 
oligo/dipeptide transporters (Table S5 in File S1). 

In S. thermosulfidooxidans, the process of pentose sugar synthesis is 
very flexible. S. thermosulfidooxidans possesses a functional oxidative 
pentose phosphate pathway (Fig. 4 and Table S4 in File S1), which 
can be used for the generation of pentose sugars in many bacteria 
and archaea [63,64,65]. Meanwhile, homologues of key genes for 
the non-oxidative pentose phosphate pathway could also be 
identified in S. thermosulfidooxidans. Specifically, S. 
thermosulfidooxidans encodes a 6-phosphogluconolactonase and two 
glucose-6-phosphate 1-dehydrogenases that catalyze the 
interconversion of hexose and pentose (Table S4 in File S1). Furthermore, 
two genes coding for 6-phosphogluconate dehydrogenase, a key 
enzyme that bridges the oxidative and non-oxidative parts of the 
pentose phosphate pathway, were identified, one of which is 
located in proximity to two other genes putatively involved in the 
pentose phosphate pathway, a transketolase and a bifunctional 
transaldolase/phosphoglucose isomerase. The bifunctional 
transaldolase/phosphoglucose isomerase is found as a fused protein 
harboring one transaldolase domain and one glucose-6-phosphate 
isomerase domain in many bacteria [66]. It is noteworthy that a 
gene encoding a monofunctional glucose-6-phosphate isomerase is 
also identified in the genome (Table S4 in File S1). 

Transport and Resistance 

The genome of S. thermosulfidooxidans encodes at least 230 
putative transporter proteins (Table S5 in File S1), which are the 
structural elements of approximately 90 transport systems (some of 
them consisting of several proteins) and represent 50 transporter 
families [22,67,68]. Furthermore, all important components of the 
general secretion (Sec) and twin-arginine translocation (Tat) 
pathways have been identified in the genome (Table S4 in File 
S1). The complement of transport systems in S. thermosulfidooxidans 
is roughly reminiscent of other Clostridiales family XVII incertae sedis, 
but more than 80 transport proteins have their best hits outside 
Clostridiales family XVII incertae sedis. 

Among the S. thermosulfidooxidans transport proteins, those 
belonging to the ATP-binding cassette (ABC) superfamily are 
the most represented (n = 87), and might be involved in the 
uptake of organic molecules. In addition, more than 15 transport 
systems composed of ABC superfamily 
proteins responsible for specific organic molecules are also 
identified. These complement the higher flexibility of S. 
thermosulfidooxidans in the central carbon metabolism. S. 
thermosulfidooxidans also harbors an oligo/dipeptide transport system 
as well as several aminopeptidases (leucyl- and methionyl- 
aminopeptidase), which could be used to release amino acids from 
the imported peptides. The uptake system for amino acids or 
peptides may reduce the energy cost of protein synthesis, and 
could also serve for the acquisition of substrates for anaplerotic 
reactions. 

Figure 4. Predicted central carbon metabolism of S. thermosulfidooxidans (see also Table S4 in File S1 for further details on 
respective EC numbers and annotation classification). Enzymatic reactions for which candidate genes can be identified in the genome of S. 
thermosulfidooxidans are highlighted by solid arrows. The reactions associated with other metabolic pathways are shown with pink arrows. The 
transmembrane transports of small organic compounds that may directly enter central carbon metabolism are also presented. 

Although the mechanism of transport of extracellular S0 to the 
cytoplasm is not clear, several candidates have been proposed to 
play important roles in S0 transport in green sulfur bacteria 
[41,69]. One possibility is that the thioredoxin SoxW acts together 
with the thiol-disulfide interchange protein DsbD within the periplasm 
in transferring S0 across the inner membrane [70]. One gene that 
potentially codes for a DsbD protein is identified in the draft 
genome (Table S5 in File S1). No significant homologue of SoxW 
was found in S. thermosulfidooxidans; however, numerous 
thioredoxin genes are identified in S. thermosulfidooxidans, and they 
could possibly perform the same function as SoxW. Furthermore, S. 
thermosulfidooxidans also possesses at least four sulfonate/nitrate/ 
taurine transport systems. These systems can transport sulfonate 
compounds into the cytoplasm, and the sulfur in sulfonate 
compounds may finally be oxidized as an energy source. 

S. thermosulfidooxidans also possesses, like other Sulfobacillus 
members, various resistance systems including more than 30 
putative metal ion efflux proteins belonging to 13 different 
transporter families (Table S5 in File S1). Most of the transport 
systems responsible for metal ion efflux belong to the ABC transporter 
superfamily. In addition to these, we also find some oxidoreductases 
that are closely related to metal resistance (Table S5 in File S1). 
For example, S. thermosulfidooxidans encodes a mercuric reductase, 
which is a FAD-containing flavoprotein and can reduce 
Hg2+ to Hg0 utilizing NADPH [71,72]. The reduction of Hg2+ is 
an important step in mercury resistance. As for arsenic resistance, 
a remarkable feature is the presence of two arsC genes coding for 
arsenate reductase not previously described in S. thermosulfidooxidans 
[73,74]. The enzyme arsenate reductase is required to confer 
resistance to As(V), since the non-enzymatic 
reduction of As(V) is too slow to be physiologically significant 
[75]. The resulting As(III) can then be pumped out of the cells by 
the ArsA/ArsB ATPase [76]. Surprisingly, phylogenetic analysis 
showed that one of the arsenate reductases is most closely related to a 
homologue in the thermophilic bacterium Thermaerobacter 
subterraneus, but the other is most closely related to a homologue in the 
acidophilic thermophilic bacterium Alicyclobacillus acidocaldarius 
(Table S5 and Fig. S3 in File S1). These results indicate that S. 
thermosulfidooxidans may have obtained new arsenic resistance capacities 
from other extremophiles sharing a similar niche. In addition, S. 
thermosulfidooxidans has three antibiotic transporter systems (Table 
S5 in File S1), which are absent from other Sulfobacillus genomes. It is 
tempting to speculate that the unique ecological niche has led S. 
thermosulfidooxidans to acquire new antibiotic resistances. 

Conclusions 

Comparative analysis of the S. thermosulfidooxidans genome 
revealed that the gene content for sulfur oxidation is similar to 
that of other sulfur-oxidizing acidophiles, but also revealed some features 
not yet found in other acidophiles. A novel sulfur oxygenase 
reductase is suggested to play a key role in the sulfur oxidation of S. 
thermosulfidooxidans. It can catalyze the conversion of the substrate 
sulfur into hydrogen sulfide, sulfite and thiosulfate [41,43]. Although 
the mechanism of iron oxidation 
is still unclear, the predicted sulfocyanin is proposed to be an 
important component of the electron transport chain in the iron 
oxidation of S. thermosulfidooxidans, as is the case in the acidophilic 
archaea Ferroplasma spp. [56]. In addition, S. thermosulfidooxidans has 
more flexibility in the central carbon metabolism, including two 
pentose phosphate pathways, flexible conversion of the central 
metabolite pyruvate and the ability to metabolize various organic 
compounds. However, glycogen may not be used as an energy 
reserve in S. thermosulfidooxidans. It also possesses numerous 
transport systems for organic compounds including multiple sugars, 
oligopeptide/dipeptide, malic acid, and various amino acids. 
These transport systems complement the higher flexibility of S. 
thermosulfidooxidans in the central carbon metabolism. Furthermore, 
it also encodes an impressive collection of resistance proteins that 
will provide a survival advantage for living in an acidic hot spring 
containing high concentrations of various heavy metals. The 
physiological versatility of S. thermosulfidooxidans might be an 
essential factor for its competitive success in the extremely acidic 
environment. 

Supporting Information 

File S1 Figures S1, S2, and S3 and Tables S1, S2, S3, S4 
and S5. Figure S1. A. Phylogeny of Clostridiales family XVII 
incertae sedis including S. thermosulfidooxidans. Phylogenetic tree based 
on the 16S rRNA gene was constructed with the neighbor-joining 
method. The scale bar represents the number of nucleotide 
substitutions per site. B. Electron micrograph of S. 
thermosulfidooxidans shows its morphology. Figure S2. Venn diagrams 